How Far Are We From (Semi-)Automatic Annotation Of Anaphoric Links In Corpora?

Author

  • Ruslan Mitkov
Abstract

The paper raises for discussion a proposal for the semi-automatic annotation of pronoun-antecedent pairs in corpora. The proposal is based on robust knowledge-poor pronoun resolution followed by post-editing. The paper is structured as follows. The introduction comments on the fact that automatic identification of referential links in corpora has lagged behind similar lexical, syntactic and even semantic tasks. The second section of the paper outlines the author's practical and robust knowledge-poor approach to pronoun resolution, which is subsequently put forward as the core of a larger architecture proposed for the automatic tagging of referential links. Section 3 briefly presents other related knowledge-poor approaches, while section 4 discusses the limitations and advantages of the practical approach. The main argument of the paper is to be found in section 5, where we present the idea of developing a semi-automatic environment for annotating anaphoric links and outline the components of such a program. Finally, the conclusion looks at the anticipated success rate of the approach.

Similar resources

Arabic anaphora resolution: corpora annotation with coreferential links

Annotated resources are much needed for the evaluation and training of anaphora resolution systems. Coreferential chain annotation is a difficult task which cannot be carried out without an appropriate tool. In this paper, we present our work on Arabic corpora annotation with anaphoric links (i.e., the annotation of the identity relation between the anaphors and their antecedents). In particular,...

Building annotated resources for automatic text summarisation

Annotated corpora are necessary for automatic summarisation, but given how difficult it is to produce them, only a few are available. This paper presents an annotation tool which helps the human annotator to select the important units from a text. In addition to the tool, a new annotation scheme is proposed so that phenomena such as the presence of anaphoric expressions and redundancy can be ...

Comparison of Annotating Methods for Named Entity Corpora

We compared two methods of annotating a corpus with non-expert annotators for a named entity (NE) recognition task: (1) revising the results of an existing NE recognizer and (2) annotating NEs only by hand. We investigated the annotation time, the degree of agreement, and the performance measured against the gold standard. As we have two annotators for one file of each method, we evaluated the ...

Automatic measurement of instantaneous changes in the walls of carotid artery with sequential ultrasound images

Introduction: This study presents a computerized analysis method for detecting instantaneous changes of the far and near walls of the common carotid artery in sequential ultrasound images by applying the maximum gradient algorithm. The maximum gradient algorithm was modified, and some characteristics of the dynamic programming algorithm were added for our applications. Methods: The algorithm was evaluat...

How Far Behind Are the South Asian Countries in Relation to East Asian

We define as South Asian countries those countries that start with Iran and end with Bangladesh in Asia. We then use export statistics in terms of revealed comparative advantage (RCA) for 14 industrial sectors to measure distances of export capabilities for these countries in relation to the “Western” developed and East Asian countries. Statistical methods such as multidimensional scaling and ...

Publication date: 1997